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Abstract 

We  discuss  methods  for  a  priori  selection  of  parameters  to  be  estimated  in  inverse  problem 
formulations  (such  as  Maximum  Likelihood,  Ordinary  and  Generalized  Least  Squares)  for 
dynamical  systems  with  numerous  state  variables  and  an  even  larger  number  of  param¬ 
eters.  We  illustrate  the  ideas  with  an  in-host  model  for  HIV  dynamics  which  has  been 
successfully  validated  with  clinical  data  and  used  for  prediction  and  a  model  for  the 
reaction  of  the  cardiovascular  system  to  an  ergometric  workload. 
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1  Introduction 


There  are  many  topics  of  great  importance  and  interest  in  the  areas  of  modeling  and  inverse 
problems  which  are  properly  viewed  as  essential  in  the  use  of  mathematics  and  statistics  in 
scientific  inquiries.  A  brief,  noninclusive  list  of  topics  include  the  use  of  traditional  sensitivity 
functions  (TSF)  and  generalized  sensitivity  functions  (GSF)  in  experimental  design  (what 
type  and  how  much  data  is  needed,  where/when  to  take  observations)  [9,  10,  11,  16,  57], 
choice  of  mathematical  models  and  their  parameterizations  (verification,  validation,  model 
selection  and  model  comparison  techniques)  [7,  12,  13,  17,  21,  22,  24,  25,  41],  choice  of 
statistical  models  (observation  process  and  sampling  errors,  residual  plots  for  statistical  model 
verification,  use  of  asymptotic  theory  and  bootstrapping  for  computation  of  standard  errors, 
confidence  intervals)  [7,  14,  30,  31,  55,  56],  choice  of  cost  functionals  (MLE,  OLS,  WLS,  GLS, 
etc.)  [7,  30],  as  well  as  parameter  identifiability  and  selectivity.  There  is  extensive  literature 
on  each  of  these  topics  and  many  have  been  treated  in  surveys  in  one  form  or  another  ([30]  is 
an  excellent  monograph  with  many  references  on  the  statistically  related  topics)  or  in  earlier 
lecture  notes  [7]. 

We  discuss  here  an  enduring  major  problem:  selection  of  those  model  parameters  which 
can  be  readily  and  reliably  (with  quantifiable  uncertainty  bounds)  estimated  in  an  inverse 
problem  formulation.  This  is  especially  important  in  many  areas  of  biological  modeling  where 
often  one  has  large  dynamical  systems  (many  state  variables),  an  even  larger  number  of 
unknown  parameters  to  be  estimated  and  a  paucity  of  longitudinal  time  observations  or  data 
points.  As  biological  and  physiological  models  (at  the  cellular,  biochemical  pathway  or  whole 
organism  level)  become  more  sophisticated  (motivated  by  increasingly  detailed  understanding 
-  or  lack  thereof  -  of  mechanisms),  it  is  becoming  quite  common  to  have  large  systems 
(10-20  or  more  differential  equations),  with  a  plethora  of  parameters  (25-100)  but  only  a 
limited  number  (50  -100  or  fewer)  of  data  points  per  individual  organism.  For  example,  we 
find  models  for  the  cardiovascular  system  [16,  Chapter  1]  (where  the  model  has  10  state 
variables  and  22  parameters)  and  [51,  Chapter  6]  (where  the  model  has  22  states  and  55 
parameters),  immunology  [49]  (8  states,  24  parameters),  metabolic  pathways  [32]  (8  states, 
35  parameters)  and  HIV  progression  [8,  43].  Fortunately,  there  is  a  growing  recent  effort 
among  scientists  to  develop  quantitative  methods  based  on  sensitivity,  information  matrices 
and  other  statistical  constructs  (see  for  example  [9,  10,  11,  23,  28,  37,  38,  60])  to  aid  in 
identification  or  parameter  estimation  formulations.  We  discuss  here  one  approach  using 
sensitivity  matrices  and  asymptotic  standard  errors  as  a  basis  for  our  developments.  To 
illustrate  our  discussions,  we  will  use  two  models  from  the  biological  sciences:  a)  a  recently 
developed  in-host  model  for  HIV  dynamics  which  has  been  successfully  validated  with  clinical 
data  and  used  for  prediction  [4,  8];  b)  a  global  non-pulsatile  model  for  the  cardiovascular 
system  which  has  been  validated  with  data  from  bicycle  ergometer  tests  [45,  16]. 

The  topic  of  system  and  parameter  identifiability  is  actually  an  old  one.  In  the  context 
of  parameter  determination  from  system  observations  or  output  it  is  at  least  forty  years  old 
and  has  received  much  attention  in  the  peak  years  of  linear  system  and  control  theory  in  the 
investigation  of  observability,  controllability  and  detectability  [6,  18,  19,  33,  39,  44,  47,  53,  54], 
These  early  investigations  and  results  were  focused  primarily  on  engineering  applications, 
although  much  interest  in  other  areas  (e.g.,  oceanography,  biology)  has  prompted  more  recent 
inquiries  for  both  linear  and  nonlinear  dynamical  systems  [5,  15,  29,  35,  42,  48,  59,  60,  61,  62]. 
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2  Statistical  Models  for  the  Observation  Process 


One  has  errors  in  any  data  collection  process  and  the  presence  of  these  errors  is  reflected  in 
any  parameter  estimation  result  one  might  obtain.  To  understand  and  treat  this,  one  usually 
specifies  a  statistical  model  for  the  observation  process  in  addition  to  the  mathematical  model 
representing  the  dynamics.  To  illustrate  ideas  here  we  use  ordinary  least  squares  (OLS) 

consistent  with  an  error  model  for  absolute  error  in  the  observations.  For  a  discussion  of 

other  frameworks  (maximum  likelihood  in  the  case  of  known  error  distributions,  generalized 
least  squares  appropriate  for  relative  error  models)  see  [7]. 

In  order  to  be  more  specific  we  assume  that  the  dynamics  of  the  system  is  modeled  by  a 
system  of  ordinary  differential  equations: 

x(t)  =  g(t,x(t),9),  t>t0,  x(t0)  =  x o(0),  (1) 

z(t)  =  h(t,x(t),6),  t>to,  0  e  A,  (2) 

where  G  C  Mn  and  dcff  are  open  sets  and  g  :  [to,  oo)  x  G  x  A  — ■>  Mn,  xq  :  A  — > ►  Mn  and 
h  :  [fo,oo)  xGxd-^l  are  sufficiently  smooth  functions.  The  set  A  is  called  the  set  of 
admissible  parameters  and  z(-)  is  the  measurable  output  of  the  system,  which  for  simplicity 
we  assume  to  be  scalar.  Let  x(t)  =  x(t:  0)  denote  the  solution  of  (1)  for  given  6  E  A  and  set 

f(t,9)  =  h(t,x(t;9),9),  t>to,  9  £  A. 


Then 


z(t)  =  f(t ;  9),  t>t0,  9  <E  A, 


(3) 


is  the  output  model  corresponding  to  the  model  (1),  (2).  It  is  clear  that  an  output  model 
of  the  form  (3)  can  also  originate  from  dynamical  models,  where  instead  of  the  ODE-system 
(1)  we  may  have  a  system  of  delay  equations  or  some  partial  differential  equation.  In  order 
to  describe  the  observation  process  we  assume  there  exists  a  vector  9q  G  A,  referred  to  as  the 
true  or  nominal  parameter  vector,  for  which  the  output  z(t)  =  f(t,  9 q)  describes  the  output 
of  the  real  system  exactly.  At  given  sampling  times 


to  <t  !<•••<  tjv, 

we  have  measurements  yj  for  the  output  of  the  real  system,  j  =  1, . . . ,  N.  It  is  also  reasonably 
assumed  that  each  of  the  N  longitudinal  measurements  yj  is  affected  by  random  deviations 
€j  from  the  true  underlying  output.  That  is,  we  assume  that  the  measurements  are  given  by 

Vj  =  f(tj‘,9o)  +  ej:  j  =  1, . . . ,  N.  (4) 

The  measurement  errors  ej  are  assumed  to  be  realizations  of  random  variables  £j  satisfying 
the  following  assumptions: 

(i)  the  errors  £j  have  mean  zero,  E  (£j)  =  0  ; 

(ii)  the  errors  £j  have  finite  common  variance,  var (£j)  =  Oq  <  oo ; 

(iii)  the  errors  £j  are  independent  (i.e. ,  co v(£j,£k)  =  0  whenever  j  A  k)  and  identically 
distributed. 
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According  to  (4)  the  measurements  yj  are  realizations  of  random  variables  Yj,  the  observations 
at  the  sampling  times  tj.  Then  the  statistical  model  for  the  scalar  observation  process  is 

Yj  =  f{tj-e  o)  +  Sj,  j  =  l,...,N.  (5) 


Assumptions  (i)  -  (iii)  imply  that  the  mean  of  the  observations  is  equal  to  the  model  output 
for  the  nominal  parameter  vector,  E(l^)  =  f(tj ;  do),  and  the  variance  in  the  observations  is 
constant  in  time,  var (Yj)  =  <7q,  j  =  1, . . . ,  N. 

For  given  measurements  y  =  (y\, . . .  ,  y./v)T  the  estimate  doLS  for  do  is  obtained  by  mini¬ 
mizing 

N 

J(y ,  0)  =  -  f(tj ;  d))2  =  I y-  F(6) I2  =  I F{6)  -  F(90 )  -  e|2,  (6) 

3= 1 

where  we  have  set 

m  =  (fit  i;  d), . . . ,  f{tN-  d))T,  e  =  (d,  •  •  • ,  C7v)T, 

and  |  •  |  is  the  Euclidean  norm  in  M^.  The  estimate  doLS  is  a  realization  of  a  random 
variable,  the  least  squares  estimator  ©ols-  In  order  to  indicate  the  dependence  on  N  we 
shall  write  9qLS  and  ©ols  when  needed.  From  [55]  we  find  that  under  a  number  of  regularity 
and  sampling  conditions,  as  N  — »•  oo,  ©qls  is  approximately  distributed  according  to  a 
multivariate  normal  distribution,  i.e., 

©OLS-A/^do,^),  (7) 

where  Eq  =  Uq  (ATI o)  1  £  Mpxp  and 

=  Jim  i-TXN(8o)TXN(0o)- 

N^oo  iv 


The  N  x  p  matrix  xN{@)  is  known  as  the  sensitivity  matrix  of  the  system,  and  is  defined  as 


df(U;  d0) 


dOj 


l<i<N,  1  <j<p 


dF 

~de 


m 


V0F(do). 


(8) 


Asymptotic  theory  [7,  30,  55]  requires  existence  and  non-singularity  of  XIq.  The  p  x  p  matrix 
Y,q  is  the  covariance  matrix  of  the  estimator  @N . 

If  the  output  model  (3)  corresponds  to  the  model  (1),  (2)  then  the  derivatives  of  /  with 
respect  to  the  parameters  are  given  by 


3/ 

dOj 


(t,d) 


dh 

dx 


(t,x(t;0),e) 


dx 


(t‘,  d)  + 


dh 

dfj 


(t,x(t;0),6),  j  =  l,...,p, 


where  w(t;  6)  =  ({dx / d6\)(t\  d), . .  , ,  ( dx/dOp)(t ;  d))  6  RNxp  is  obtained  by  solving 


x(t;  9)  =  g(t ,  x(t;  9),  d),  x(t0;  9)  =  x0(d), 

w(t,  9)  =  (t,  x(f;  9),9)w(t ;  d)  +  x(t;  d),  d),  w(t0;  d)  = 


(9) 


(10) 


from  t  =  to  to  t  =  t]\r.  One  could  alternatively  obtain  the  sensitivity  matrix  using  differ¬ 
ence  quotients  (usually  less  accurately)  or  by  using  automated  differentiation  software  (for 
additional  details  on  sensitivity  matrix  calculations  see  [7,  9,  27,  28,  34,  36]). 
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The  estimate  #ols  =  ^ols  a  realization  of  the  estimator  ©ols>  and  is  calculated  using  a 
realization  {yi}^1  of  the  observation  process  while  minimizing  (6)  over  9.  Moreover, 

the  estimate  #ols  is  used  in  the  calculation  of  the  sampling  distribution  for  the  parameters. 
The  generally  unknown  error  variance  <7q  is  approximated  by 

1  N 

^OLS  =  N  -p  Vj  ~  f^i^OLs))  >  i11) 

F  3= 1 

while  the  covariance  matrix  is  approximated  by 

^OLS  =  ^OLS  (x7V(^OLs)TX7V(^OLs)^)  •  (12) 

As  discussed  in  [7,  30,  55]  an  approximate  for  the  sampling  distribution  of  the  estimator 
is  given  by 

©OLS  =  ©OLS  ~  KVo,  Xo)  ~  -^p(^OLS)  ^OLs)-  (13) 

Asymptotic  standard  errors  can  be  used  to  quantify  uncertainty  in  the  estimation,  and  they 
are  calculated  by  taking  the  square  roots  of  the  diagonal  elements  of  the  covariance  matrix 
^OLS’  i-e-> 

SEfc(Cs)=  \/(S£Lsk*.  k  =  h...,p.  (14) 

Before  describing  the  algorithm  in  detail  and  illustrating  its  use,  we  provide  some  moti¬ 
vation  underlying  the  use  of  the  sensitivity  matrix  x($o)  of  (8)  and  the  Fisher  Information 
Matrix  F(9q)  =  (1/ctq)xT(#o)x($o)-  Both  of  these  matrices  play  a  fundamental  role  in  the 
development  of  the  approximate  asymptotic  distributional  theory  resulting  in  (13)  and  (14). 
Since  a  prominent  measure  of  the  ability  to  estimate  a  parameter  is  related  to  its  associated 
standard  errors  in  estimation,  it  is  worthwhile  to  briefly  outline  the  underlying  approximation 
ideas  in  the  asymptotic  distributional  theory. 

Ordinary  least  squares  problems  involve  choosing  ©  =  ©ols  to  minimize  the  difference 
between  observations  Y  and  model  output  F(9),  i.e.,  minimize  | Y  —  F{9) |.  However  the 
approximate  asymptotic  distributional  theory  (e.g.,  see  [55,  Chapter  12])  which  is  exact  for 
model  outputs  linear  in  the  parameters,  employs  a  fundamental  linearization  in  the  param¬ 
eters  in  a  neighborhood  of  the  hypothesized  “true”  parameters  9q.  Replacing  the  model 
output  with  a  first  order  linearization  about  9q,  we  then  may  seek  to  minimize  for  9  in  the 
approximate  functional 

\Y-F(90)-VeF(90)[9-90}\. 

If  we  use  the  statistical  model  Y  =  F(9q)  +  £  and  let  59  =  6  —  9q,  we  thus  wish  to  minimize 

\£~X(90)59\, 

where  x($o)  =  V gF(9o )  is  the  N  x  p  sensitivity  matrix  defined  in  (8).  This  is  a  standard 
optimization  problem  [46,  Section  6.11]  whose  solution  can  be  given  using  the  pseudo  inverse 
x(#o)^  defined  in  terms  of  minimal  norm  solutions  of  the  optimization  problem  and  satisfying 
x(#o)t  =  (x(#o)Tx(6'o))tx(#o)T  =  cro-^"(^o)tx($o)T-  The  solution  is 

50  =  X(8o)'£ 


or 

©LIN  =  9o  +  x(0o)f£  =  00  +  4F(90)h(9o)T£- 
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(15) 


If  J-(Oq)  is  invertible,  then  the  solution  (to  first  order)  of  the  OLS  problem  is 

@OLS  ~  @LIN  =  90  +  <7o-^($o)~1X(#o)T£- 

This  approximation,  for  which  the  asymptotic  distributional  theory  is  exact,  can  be  a  rea¬ 
sonable  one  for  use  in  developing  an  approximate  nonlinear  asymptotic  theory  if  <5©  is  small, 
i.e. ,  if  the  OLS  estimated  parameter  is  close  to  9q. 

From  these  calculations,  we  see  that  the  rank  of  x(#o)  and  the  conditioning  (or  ill- 
conditioning)  of  J-(Oq)  play  a  significant  role  in  solving  OLS  inverse  problems  as  well  as 
in  any  asymptotic  standard  error  formulations  based  on  this  linearization.  Observe  that  the 
error  (or  noise)  £  in  the  data  will  in  general  be  amplified  as  the  ill-conditioning  of  T  in¬ 
creases.  We  further  note  that  the  N  x  p  sensitivity  matrix  x($o)  is  of  full  rank  p  if  and  only 
if  the  px  p  Fisher  matrix  J-(9q)  has  rank  p,  or  equivalently,  is  nonsingular.  These  underlying 
considerations  have  motivated  a  number  of  efforts  (e.g.,  see  [9,  10,  11])  on  understanding  the 
conditioning  of  the  Fisher  matrix  as  a  function  of  the  number  N  and  longitudinal  locations 
{tj }jLi  of  data  points  as  a  key  indicator  for  well-formulated  inverse  problems  and  as  a  tool 
in  optimal  design,  especially  with  respect  to  computation  of  uncertainty  (standard  errors, 
confidence  intervals)  for  parameter  estimates. 

In  view  of  the  considerations  above  (which  are  very  local  in  nature  -  both  the  sensi¬ 
tivity  matrix  and  the  Fisher  information  matrix  are  taken  at  the  nominal  vector  9q),  one 
should  be  pessimistic  about  using  these  quantities  to  obtain  any  nonlocal  selection  methods 
or  criteria  for  estimation.  Indeed,  for  nonlinear  complex  systems,  it  is  easy  to  argue  that 
questions  related  to  some  type  of  global  parameter  identifiability  are  not  fruitful  questions 
to  be  pursuing. 

3  Subset  Selection  Algorithm 

The  focus  of  our  presentation  here  is  how  one  chooses  a  priori  (i.e.,  before  any  inverse  problem 
calculations  are  carried  out)  which  parameters  can  be  readily  estimated  with  a  typical  longi¬ 
tudinal  data  set.  We  illustrate  an  algorithm,  developed  recently  in  [28],  to  select  parameter 
vectors  that  can  be  estimated  from  a  given  data  set  using  an  ordinary  least  squares  inverse 
problem  formulation  (similar  ideas  apply  if  one  is  using  a  relative  error  statistical  model  and 
generalized  least  squares  formulations).  Let  q  G  Mpo  be  the  parameter  vectors  being  at  our 
disposal  for  parameter  estimation  and  denote  by  qo  G  MPo  the  vector  of  the  corresponding 
nominal  values.  Given  a  number  p  <  po  of  parameters  we  wish  to  identify,  the  algorithm 
searches  all  possible  choices  of  p  different  parameters  among  the  po  parameters  and  selects 
the  one  which  is  identifiable  (i.e.,  the  corresponding  sensitivity  matrix  has  full  rank  p)  and 
minimizes  a  given  uncertainty  quantification  (e.g.,  by  means  of  asymptotic  standard  errors). 
Prior  knowledge  of  a  nominal  set  of  values  for  all  parameters  along  with  the  observation 
times  for  data  (but  not  the  values  of  the  observations)  will  be  required  for  our  algorithm.  For 
p  <  po  we  set 

Sp  =  {0  G  Rp|  6  is  a  sub- vector  of  q  G  Mpo}, 

i.e.,  6  G  Sp  is  given  as  9  =  (qn , . . . ,  qip)T  for  some  1  <  h  <  ■  ■  ■  <  ip  <  po ■  The  corresponding 
nominal  vector  is  90  =  {(qo)h,  ■  ■  • ,  (<?o)*p)T- 

As  we  have  stated  above,  to  apply  the  parameter  subset  selection  algorithm  we  require 
prior  knowledge  of  nominal  variance  and  nominal  parameter  values.  These  nominal  values 
of  ao  and  9q  are  needed  to  calculate  the  sensitivity  matrix,  the  Fisher  matrix  and  the  corre¬ 
sponding  covariance  matrix  defined  in  (12).  For  our  illustration  presented  in  Section  5,  we 
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use  the  variance  and  parameter  estimates  obtained  in  previous  investigations  of  the  models  as 
nominal  values.  In  problems  for  which  no  prior  estimation  has  been  carried  out,  one  must  use 
knowledge  of  the  observation  process  error  and  some  knowledge  of  viable  parameter  values 
that  might  be  reasonable  with  the  model  under  investigation. 

The  uncertainty  quantification  we  shall  use  is  based  on  the  considerations  given  in  the 
previous  section.  Let  0  £  be  given.  As  an  approximation  to  the  covariance  matrix  of  the 
estimator  for  6  we  take 


£(0O)  =  °o{x(0o)Tx(0o))  1  =  .mr1. 


We  introduce  the  coefficients  of  variation  for  0 


Vi{0  0) 


(£(0o)v)1/2 

(0o  )i 


*  =  !,•••,  P, 


(16) 


and  take  as  a  uncertainty  quantification  for  the  estimates  of  9  the  selection  score  given  by 
the  Euclidean  norm  in  6  £  Mp  of  v(9q),  i.e., 


Oi(9  o)  =  W(9o)\, 

where  v(Qq)  =  (yi(9o),  ■  ■  ■ ,  isp(9o))J  ■  The  components  of  the  vector  u(9q)  are  the  ratios  of  each 
standard  error  for  a  parameter  to  the  corresponding  nominal  parameter  value.  These  ratios 
are  dimensionless  numbers  warranting  comparison  even  when  parameters  have  considerably 
different  scales  and  units  (e.g.,  in  case  of  the  HIV-model  Nt  is  on  the  order  of  101,  while 
k\  is  on  the  order  of  ICE6,  whereas  in  case  of  the  CVS-model  we  have  C(  on  the  order  ICE2 
and  Apggji  on  the  order  102).  A  selection  score  a  (do)  near  zero  indicates  lower  uncertainty 
possibilities  in  the  estimation,  while  large  values  of  a(9o)  suggest  that  one  could  expect  to  find 
substantial  uncertainty  in  at  least  some  of  the  components  of  the  estimates  in  any  parameter 
estimation  attempt. 

Let  To  be  the  Fisher  information  matrix  corresponding  to  the  parameter  vectors  q  £  MPo 
and  Tp  the  Fisher  information  matrix  corresponding  to  the  parameter  vectors  in  8  £  Sp. 
Then  rank To(qo)  =  Po  implies  that  rank  JFp(0o]  =  p  for  any  6  £  Sp.  p  <  po,  i.e.,  if  To(qo)  is 
non-singular  then  also  all  Tp(9o)  are  non-singular  for  all  p  <  po  and  all  do  corresponding  to  a 
6  £  Sp.  Moreover,  if  rank  JF0(^0)  =  p\  with  p\  <  po,  then  rank T(6o)  <  p  for  all  p\  <  p  <  po 
and  all  0  £  Sp. 

On  the  basis  of  the  considerations  given  above  our  algorithm  proceeds  as  follows: 


Selection  Algorithm.  Given  p  <  po  the  algorithm  considers  all  possible  choices  of  indices 
ii, . . .  ,ip  with  1  <  i±  <  ■  ■  ■  <  ip  <  po  in  lexicographical  ordering  starting  with  the  first  choice 
(4^,  •  •  • ,  i^)  =  (1 , ,p)  and  completes  the  following  steps: 

Initializing  step:  Set  indsel  =  (1 , ...  ,  p)  and  asel  =  oo. 

Step  k:  For  the  choice  (i{\  ■  ■  ■  ,4^)  compute  r  =  rank  T  ((qo) -(k) , . .  ■ ,  (<?o)-(*o))- 
If  r  <  p,  go  to  Step  k  +  1 . 

Ifr  =  p,  compute  ak  =  a((g0) m,  .  •  • ,  (go)-w))- 

<'1  I'p 

If  o-k  h  nsel,  go  to  Step  k  +  1. 

If  ak  <  asel,  set  indsel  =  (i[k  \  . . . ,  4^)>  aSel  =  ak  and  go  to  Step  k  +  1. 

Following  the  initializing  step  the  algorithm  performs  (p°)  steps  and  provides  the  index 
vector  indsel  =  (i^...,i*)  which  gives  the  sub-vector  6*  =  (qq,...,  qi*)T  such  that  the 
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selection  score  •  •  • ,  (qo)i*)  is  minimal  among  all  possible  choices  of  sub-vectors  in  Sp. 

If  rank  Epo  =  po  then  the  rank  test  in  Step  k  can  be  canceled,  of  course. 

4  Models 

In  the  following  we  shall  illustrate  the  parameter  selection  ideas  with  results  obtained  by  use 
of  the  subset  selection  algorithm  described  in  the  previous  section  for  two  specific  models. 
These  models  have  a  moderate  number  of  parameters  to  be  identified  yet  are  sufficiently 
complex  to  make  a  trial-error  approach  unfeasible. 

4.1  A  Mathematical  Model  for  HIV  Progression  with  Treatment  Interrup¬ 
tion 

As  our  first  illustrative  example  we  use  one  of  the  many  dynamic  models  for  HIV  progression 
found  in  an  extensive  literature  (e.g.,  see  [1,  2,  3,  4,  8,  20,  26,  50,  52,  58]  and  the  many 
references  therein).  For  our  example  model,  the  dynamics  of  in- host  HIV  is  described  by  the 
interactions  between  uninfected  and  infected  type  1  target  cells  (Tf  and  T*)  (CD4+  T-cells), 
uninfected  and  infected  type  2  target  cells  (T2  and  T2 )  (such  as  macrophages  or  memory  cells, 
etc.),  infectious  free  virus  Vj,  and  immune  response  E  (cytotoxic  T-lymphocytes  CD8+)  to 
the  infection.  This  model,  which  was  developed  and  studied  in  [1,  4]  and  later  extended  in 
subsequent  efforts  (e.g.,  see  [8]),  is  essentially  one  suggested  in  [26],  but  includes  an  immune 
response  compartment  and  dynamics  as  in  [20].  The  model  equations  are  given  by 


Ti  =  Ai  -  d\  T\  -  (1  -  ei(t))A:iV/Ti, 

T\  =  (1  -  ei(f))AqV/Ti  -  ST*  -  mxET*x, 
T2  =  \2-  d2T2  -  (1  -  fe i(t))k2V!T2i 


H  =  (1  -  fei(t))k2VIT2  -  STf  -  rn2ET*7 

V7  =  (1  -  e2(t))l03AT5(T1*  +  T2*)  -  cVr 

-  (1  -  ei (l))l03hTih  -  (1  -  /ei(i))103fc2T2Vr, 


E  =  \E 


bE(T*  +  T*)  e_  dE(T*  +  T*) 


T*  +  T*  +  I<b  T*  +  T*  +  Kd 


E  -  SeE, 


together  with  an  initial  condition  vector 


{Ti (0) ,  T*  (0) ,  T2(0) ,  T*  (0) ,  Vf  (0) ,  E{ 0)) T. 


(17) 


The  differences  in  infection  rates  and  treatment  efficacy  help  create  a  low,  but  non-zero, 
infected  cell  steady  state  for  T2,  which  is  compatible  with  the  idea  that  macrophages  or 
memory  cells  may  be  an  important  source  of  virus  after  T-cell  depletion.  The  populations 
of  uninfected  target  cells  T\  and  T2  may  have  different  source  rates  A  $  and  natural  death 
rates  dt .  The  time-dependent  treatment  factors  e\ (t)  =  e\ u(t)  and  e2(t)  =  e2u(t )  represent 
the  effective  treatment  impact  of  a  reverse  transcriptase  inhibitor  (RTI)  (that  blocks  new 
infections)  and  a  protease  inhibitor  (PI)  (which  causes  infected  cells  to  produce  non-infectious 
virus),  respectively.  The  RTI  is  potentially  more  effective  in  type  1  target  cells  (Ti  and  T±) 
than  in  type  2  target  cells  (T2  and  T2),  where  the  efficacy  is  fe  1,  with  /  £  [0, 1].  The  relative 
effectiveness  of  RTIs  is  modeled  by  ei  and  that  of  Pis  by  e2,  while  the  time-dependent 


treatment  function  0  <  u(t)  <  1  represents  therapy  levels,  with  u(t)  =  0  for  fully  off  and 
u(t)  =  1,  for  fully  on.  Although  HIV  treatment  is  nearly  always  administered  as  combination 
therapy,  the  model  allows  the  possibility  of  mono-therapy,  even  for  a  limited  period  of  time, 
implemented  by  considering  separate  treatment  functions  u\(t),u2{t)  in  the  treatment  factors. 

As  in  [1,  4],  for  our  numerical  investigations  we  consider  a  log-transformed  and  reduced 
version  of  the  model.  This  transformation  is  frequently  used  in  the  HIV  modeling  literature 
because  of  the  large  differences  in  orders  of  magnitude  in  state  values  in  the  model  and  the 
data  and  to  guarantee  non-negative  state  values  as  well  as  because  of  certain  probabilis¬ 
tic  considerations  (for  further  discussions  see  [4]).  This  results  in  the  nonlinear  system  of 
differential  equations 


Xl 

i'2 

X3 

24 

X5 

X6 


1  rr*1  /  \ 

-  dilO*1  -  (1  -£!(*)) AqlO^lO*1), 

^^((1  -ei^fcilO^lO*1  -S10X2  -milO^lO12), 

^(A2  -  d2 10-3  -  (1  -  fs1(t))k2ioxnox*y 


-  f£i{t))k2ioxnoX3  -  siox*  -  m2ioxnox^, 

1  n~x5  / 

j^y((l  -  e2(t))l03NT5(l0X2  +  10*4)  -  clO*5 

-  (1  -ei(t))/0ilO3fcilOXllO*5  -  (1  -  p2103k210X310X5^J , 

^—(x  1  b^WX2  +  1°X4)  IQ.. 

ln(10)  V  E  10*2  +  10*4  +  Kb 


dE(  lO^  +  lO-4) 
1CP2  +  10* 4  +  Kd 


<^10*6) , 


where  the  changes  of  variables  are  defined  by 


(18) 


T\  =  io*1,  t;  =  ioX2,  t2  =  ioX3,  t2*  =  10*4,  v 1  =  iox 5,  e  =  ioX6. 

The  initial  conditions  for  equations  (18)  are  denoted  by  2,(fo)  =  x®,  i  =  1, ...  ,6.  We  note 
that  this  model  has  six  state  variables  and  the  following  22  (in  general,  unknown)  system 
parameters  in  the  right-hand  sides  of  equations  (18) 

Ai,  di,  ei,  ki,  X2,  d2 ,  /,  k2,d,  mi,  m2, .  • . 

. . .  e2,  Nt,  c,  pi,p2,  A Ei  bE,  Kb,  dE,  Kd,  SE. 


We  will  also  consider  the  initial  conditions  as  unknowns  and  thus  we  have  28  unknown 
parameters  which  we  collect  in  the  parameter  vector  9, 

9  =  (2?,  2°,  2°,  2°,  2°,  2g,  A1;  di,  ei,  k\,  \2,  d2,  /,  k2, 5,  mi,  m2,  - .  • 

. . .  e2,  Nt,  c,  pi,p2,  XE,  bE,  Kb,  dEl  Kd,  5E)T . 

A  list  of  the  parameters  in  the  model  equations  along  with  their  units  is  given  below  in 
Table  1. 

As  reported  in  [1,  4],  data  to  be  used  with  this  model  in  inverse  or  parameter  estimation 
problems  typically  consisted  of  observations  for  T\  +  Tj*  and  V  over  some  extended  time 
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Table  1:  Parameters  in  the  equations  of  the  HIV  model. 


Parameter 

Units 

Description 

Ai 

cells/ (ml  day) 

production  rate  of  type  1  target  cells 

di 

1/day 

death  rate  of  type  1  target  cells 

ei 

— 

treatment  efficacy  of  type  1  target  cells 

k  i 

ml/ (virion  day) 

infection  rate  of  type  1  target  cells 

A2 

cells/(ml  day) 

production  rate  of  type  2  target  cells 

di 

1/day 

death  rate  of  type  2  target  cells 

f 

— 

reduction  of  treatment  efficacy  for  type  2  target  cells 

h2 

ml/ (virion  day) 

infection  rate  of  type  2  target  cells 

5 

1/day 

death  rate  of  infected  cells 

m  1 

ml/(cell  day) 

immune-induced  clearance  rate  for  type  1  target  cells 

m2 

ml/(cell  day) 

immune-induced  clearance  rate  for  type  2  target  cells 

£2 

— 

treatment  efficacy  for  type  2  target  cells 

Nt 

virions/cell 

virions  produced  per  infected  cell 

c 

1/day 

natural  death  rate  of  viruses 

Pi 

virions/cell 

average  number  of  virions  infecting  a  type  1  cell 

p2 

virions/cell 

average  number  of  virions  infecting  a  type  2  cell 

A  E 

cells/ (ml  day) 

production  rate  for  immune  effectors 

bE 

1/day 

maximum  birth  rate  for  immune  effectors 

Kb 

cells/ml 

saturation  constant  for  immune  effector  birth 

dE 

1/day 

maximum  death  rate  for  immune  effectors 

Kd 

cells/ml 

saturation  constant  for  immune  effector  death 

5e 

1/day 

natural  death  rate  for  immune  effectors 

period.  For  the  purpose  of  this  paper  we  are  only  using  the  data  for  T\  +  T*  in  case  of 
patient  #  4  which  we  depict  in  Figure  1  together  with  the  corresponding  model  output  for 
the  parameters  identified  in  [1,  4].  Thus  our  observations  are 

f(U;  Oq)  =  log10  +  lO*2^)  ,  (19) 

where  the  nominal  parameter  vector  Oq  is  given  at  the  beginning  of  Subsection  5.1. 

While  the  inverse  problem  we  are  considering  in  this  paper  for  the  HIV-model  is  relatively 
“small”  compared  to  many  of  those  found  in  the  literature,  it  is  still  represents  a  nontrivial 
estimation  challenge  and  is  more  than  sufficient  to  illustrate  the  ideas  and  methodology 
we  discuss  in  this  presentation.  Other  difficult  aspects  (censored  data  requiring  use  of  the 
Expectation  Maximization  algorithm  as  well  as  use  of  residual  plots  in  attempts  to  validate 
the  correctness  of  choice  of  corresponding  statistical  models  introduced  and  discussed  in 
Section  2)  of  such  inverse  problems  are  discussed  in  the  review  chapter  [7]  and  will  not  be 
pursued  here. 
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Patient  #4 


Figure  1:  Log-scaled  data  {yj}  for  CD4+  T-cells  of  Patient  #4  (crosses),  and  model  output  z(t)  (solid  curve) 
evaluated  at  the  parameter  estimates  obtained  in  [1,  4], 


4.2  A  model  for  the  reaction  of  the  cardiovascular  system  to  an  ergometric 
workload 

As  a  second  example  to  illustrate  our  methods  we  chose  a  model  for  cardiovascular  function. 
The  model  was  developed  in  order  to  describe  the  reaction  of  the  cardiovascular  system  to  a 
constant  ergometric  workload  of  moderate  size.  The  building  block  of  the  model  are  the  left 
ventricle,  the  arterial  and  venous  systemic  compartments  representing  the  systemic  circuit 
as  well  as  the  right  ventricle,  arterial  and  venous  pulmonary  compartments  representing  the 
pulmonary  circuit.  The  model  is  non-pulsatile  and  includes  the  baroreceptor  loop  as  the 
essential  control  loop  in  the  situation  to  be  modeled.  The  feedback  control  which  steers  the 
system  from  the  equilibrium  corresponding  to  rest  to  the  equilibrium  corresponding  to  the 
imposed  workload  is  obtained  by  solving  a  linear  quadratic  regulator  problem.  Furthermore 
the  model  includes  a  sub-model  describing  the  so  called  autoregulation  process,  i.e.,  the 
control  of  local  blood  flow  in  response  to  local  metabolic  demands.  The  model  equations  are 
given  as  (for  details  see  [45],  [16,  Chapter  1]): 
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with 


where 


Cas 

[Qe 

1 

(  1 

^vs 

VPs 

1 

(  Q 

Cap 

1  Vr 

>  _  p 

as  1  vs )  1 1 


Pas  = 

Pvs  =  - (  ~S~(Pas  ~  Pvs)  —  Qr  ) , 


v-4  ap  vp  \  as  5  vs  •>  ap  )  )  J  ) 

=  Oy, 

o-£  =  -ayS^  -  7^oy  +  fyH, 

Sr  =  Or, 

dr  =  -arSr  -  7rcrr  +  (3tH, 

Rs  =  ^  (Apesk(PasRPvsCa, q2  -M0-pW)~  (Pas  -  Pv; 

=  'u(t) 


-fvp  —  -fvp(-fas5  P vsi  P =ip)  • —  (^tot  Ca,sP as  CvsP vs  Cap-^ ap)  ? 


-vp 


Qe  =  H 

Qr  =  H 


C^Pvp(Pas)  Pvs,  Pip ) *-7: (  P) ^7 

aKP)Pas  +  ’ 

CrPvs^r(P)  hV 


ar(P)Pap  +  fcr(p-)Pr’ 

Ay(P)  =  e 

kr(H) 


,-{ctRe)  Pd(-ff)  and  ^(iJ)  =  i  _ 

-{cTRr)~Hd(H)  and  ar(H)  =  i-kr(H). 


(20) 


(21) 


(22) 


For  the  duration  trj  of  the  diastole  we  use  Bazett’s  formula  (duration  of  the  systole  =  n/H 1//2) 
which  implies 

td  =  td(H)  =  ~  K)'  (23^ 

Introducing  x  =  (Pas,  Pvs,  Pap,  ft,  oy,  Sr,  Or,  Rs,  H)J  G  M9  system  (20)  can  be  written  as 

x(t)  =  f(x(t),W,9,u(t)), 

where  W  =  VFrest  =  0  (Watt)  for  t  <=  0  and  W  =  Wexer  =  75  (Watt)  for  t  >  0.  Moreover,  9 
is  the  parameter  vector  of  the  system.  We  distinguish  two  values  for  each  of  the  parameters 
Rv  and  Apesk,  one  for  the  resting  situation  and  one  for  the  exercise  situation.  Consequently 
we  have  the  parameters  Ppest,  and  Ppxer  and  instead  of  Pp  and  Apesk-  The  initial 

value  for  system  (20)  is  the  equilibrium  xrest,  which  is  computed  from  /(xrest,  WTest,9,  0)  =  0. 
Analogously  xexer  is  the  equilibrium  corresponding  to  the  constant  workload  lFexer  (satisfying 
/(xexer,  WexeT,  9, 0)  =  0). 

Let  B  =  (0, . . . ,  0,  l)T  E  M9,  C  =  (1,0,...,  0)  G  Mlx9  and  A(9)  =  mx'WdT'm \x=x^^ 
the  Jacobian  of  the  right-hand  side  of  system  (20)  at  the  equilibrium  xexer.  The  control  u(t) 
is  obtained  as  the  solution  of  the  linear-quadratic  regulator  problem  for  the  linear  system 

i(t)  =  A(0)£(t)  +  Bu(t),  £(0)  =  xrest  -  xexer,  (24) 
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where  we  have  set  £(t)  =  x(t)  —  xexer,  and  the  quadratic  cost  functional 

roc 

•/(»)=/  (ai(PUt)-PZ€,f  +  u(tf)dt.  (25) 

JO 

This  functional  reflects  the  fact  that  the  baroreceptor  loop,  which  is  the  basic  control  loop 
for  the  situation  we  are  considering  here,  generates  the  control  using  the  arterial  systemic 
pressure  Pas(t). 

According  to  the  theory  of  the  linear-quadratic  control  problem  u(t)  is  given  by 

u(t)  =  -BJX£{t)  =  -BJX{x{t)  -  zexer),  t  >  0,  (26) 

where  A'  =  X{9)  is  the  solution  of  the  Riccati  matrix  equation  XA(0)  +  A(9)TX  —  XBBTX  + 
CTC  =  0.  The  feedback  control  (26)  is  also  a  stabilizing  control  for  system  (20),  i.e. ,  we  have 
lirn^oc  x(t)  =  xexer  provided  ||x(0)  —  xexei  ||2  is  sufficiently  small. 


Table  2:  Parameters  of  the  CVS-model. 


Parameter 

Units 

Description 

Cas 

liter/mmHg 

compliance  of  the  arterial  systemic  compartment 

Cvs 

liter/mmHg 

compliance  of  the  venous  systemic  compartment 

Cap 

liter/mmHg 

compliance  of  the  arterial  pulmonary  compartment 

Cvp 

liter/mmHg 

compliance  of  the  venous  pulmonary  compartment 

Cl 

liter/mmHg 

compliance  of  the  relaxed  left  ventricle 

Cr 

liter/mmHg 

compliance  of  the  relaxed  right  ventricle 

Rot 

liter 

total  blood  volume 

Rv 

mmHg  min/liter 

resistance  in  the  peripheral  region  of  the  pulmonary  circuit 

Re 

mmHg  min/liter 

inflow  resistance  of  the  left  ventricle 

Rr 

mmHg  min/liter 

inflow  resistance  of  the  right  ventricle 

K 

min1/2 

coefficient  in  Bazett’s  formula  (see  (23)) 

Pa,02 

1 

C>2-concentration  in  arterial  systemic  blood 

I< 

liter 

constant  in  the  formula  for  the  biochemical  energy  flow, 

Mb  =  -KdCv,o2/dt 

Apesk 

mmHg  min/liter 

constant  in  the  formula  relating  peripheral  systemic  resistance  and 
venous  O2  concentration  (Rs  =  ApeskCV.cu) 

M0 

liter/min 

metabolic  rate  in  the  systemic  tissue  region  corresponding  to  zero 
workload 

P 

liter/ (min  Watt) 

coefficient  of  W  in  the  differential  equation  for  Rs 

Qas 

min-  2  (mmHg) - 1 

weighting  factor  for  Pas  in  the  cost  functional  (25) 

ae 

min-2 

coefficient  of  St  in  the  differential  equation  for  St 

a  r 

min-2 

coefficient  of  Sr  in  the  differential  equation  for  Sr 

Ue 

mmHg/min 

coefficient  of  H  in  the  differential  equation  for  St 

Ur 

mmHg/min 

coefficient  of  H  in  the  differential  equation  for  Sr 

■ye 

min-1 

coefficient  of  St  in  the  differential  equation  for  St 

7r 

min-1 

coefficient  of  ST  in  the  differential  equation  for  Sr 

The  parameter  vector  of  the  system  is 


q  =  (ce,cT,  ^as  5  Cvs  i  Cap  5  Cvp  •>  Ri  i  Rr  ?  ^i  i  C^r  i  Pi  5  Pt  i  ^1i  j  •  •  • 

•  •  •  7r,  K,  K,  Mo,  p,  Ca, Q2 , 9as,  Vtat,  b-r^P-k,  r- ^pesk)T  ^  K25. 


(27) 


Tables  2  and  3  contain  the  descriptions  and  units  for  the  parameters  q  and  the  state  variables 
x,  respectively,  of  the  system. 
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Table  3:  The  state  variables  and  other  variables  of  the  CVS-model. 


Variable 

Unit 

Description 

P 

-1  as 

mmHg 

pressure  in  the  arterial  systemic  compartment 

p 

j-  vs 

mmHg 

pressure  in  the  venous  systemic  compartment 

p 

t  ap 

mmHg 

pressure  in  the  arterial  pulmonary  compartment 

Pvp 

mmHg 

pressure  in  the  venous  pulmonary  compartment 

se 

mmHg 

contractility  of  the  left  ventricle 

mmHg/min 

time  derivative  of  St 

Sr 

mmHg 

contractility  of  the  right  ventricle 

CTr 

mmHg/min 

time  derivative  of  ST 

Rs 

mmHg  min/liter 

peripheral  resistance  in  the  systemic  circuit 

H 

min-1 

heart  rate 

Qe 

liter/min 

cardiac  output  of  the  left  ventricle 

Qr 

liter/min 

cardiac  output  of  the  right  ventricle 

W 

Watt 

workload  imposed  on  the  test  person 

5  Results  and  Discussion 

5.1  The  HIV-model 

As  the  nominal  parameter  vector  we  take  the  estimates  obtained  in  [1,  4]  for  Patient  # 
4.  More  precisely,  we  assume  that  the  error  variance  is  a q  =  0.11,  and  that  the  nominal 
parameter  values  (for  description  and  units  see  Table  1)  are  given  as: 

xi  =  logi0(1.202e+3),  x\  =  log10(6.165e+l),  x°  =  log10(1.755e+l), 

x°  =  log10(6.096e-l),  x°  =  log10(9.964e+5),  2g  =  log10(1.883e-l), 

Ai  =  4.633,  d±  =  4.533  x  10-3,  a  =  6.017  x  10~\ 

ki  =  1.976  x  10“6,  A2  =  1.001  x  10"1,  d2  =  2.211  x  10"2, 

/  =  5.3915  x  HT1,  k2  =  5.529  xl0“4,  5  =  1.865  x  11T1, 

m i  =  2.439  x  11T2,  m2  =  1.3099  x  11T2,  e2  =  5.043  x  HT1, 

Nt  =  1.904  x  101,  c  =  1.936  x  101,  p\  =  1.000, 

p2  =  1.000,  A E  =  9.909  x  10“3,  bE  =  9.785  x  10“2, 

Kb  =  3.909  x  10-1,  dE  =  1.021  x  10_1,  Kd  =  8.379  x  10_1, 

SE  =  7.030  x  10“2. 

In  Figure  1  above  we  depicted  the  log-scaled  longitudinal  observations  {?/*}  on  the  number 
of  CD4+  T-cells  and  the  model  output  z(tj)  =  f(tj,9o),  j  =  1 ,  ...,1V,  evaluated  at  the 
nominal  parameter  vector  and  given  in  (19). 

It  is  assumed  that  the  following  parameters  are  always  fixed  at  the  values  given  above: 

2°,  x°,  x°,  pi,  and  p2. 

In  other  words,  the  parameters  to  be  selected  for  estimation  will  always  constitute  a  sub¬ 
vector  of 

q  =  (xj,  22,  x3,  Ai,  di,  ei,  ki,  A2,  d2,  f,k2,  S,. . . 

. . .  mi,m2,  e2,  NE,  c,  XE,  bE,  Kb,  dE,  Kd,  SE)  G  M23.  (28) 
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Let  <7-23  denote  the  Fisher  information  matrix  of  system  (18)  for  the  23  parameters  of  q 
as  given  in  (28)  at  their  nominal  values  go  as  given  above.  Then  we  have 

cond ^23(^0)  =  1-712  x  1024, 

i.e.,  JF2s(go)  is  non-singular,  but  ill-conditioned.  Since  ^23(90)  is  non-singular,  the  Fisher 
information  matrix  for  any  sub-vector  9  of  q  at  the  corresponding  nominal  parameter  values 
is  also  non-singular.  Consequently  the  regularity  check  for  the  Fisher  information  matrix  in 
the  subset  selection  algorithm  can  be  deleted  in  case  of  the  HIV-model. 


Table  4:  The  top  five  parameter  vectors  obtained  with  subset  selection  algorithm  for  p  =  11.  For  each  parameter 
vector  9  the  condition  number  k(J-{9o))  of  the  Fisher  information  matrix  and  the  selection  score  a(#o)  are  displayed. 
The  next  two  lines  show  k(J-(9o))  and  a(6o)  for  the  sub-vector  6  £  R9  which  is  common  to  the  top  five  parameter 
vectors  and  for  the  optimal  parameter  vector  in  R9.  The  last  line  presents  the  sub-vector  in  R11  with  the  largest 
selection  score. 


Parameter 

vector 

e 

k(H0o)) 

a{80) 

(*?, 

x°, 

^1> 

di. 

£1, 

A2, 

d2 , 

fc2. 

,  5, 

C2, 

Nt) 

9.841xl010 

2.881X101 

(*?, 

X°5, 

di. 

£1, 

A2, 

d2 , 

fc2. 

.  s, 

C2, 

c) 

9.845xl010 

2.883X101 

(*?, 

X°5, 

di. 

ei, 

fci, 

A2, 

da. 

,  ko 

!,  <5, 

£2) 

4.388xl016 

2.896X101 

(xl 

x°, 

di. 

ei, 

A2, 

d2, 

fc2. 

,  <5, 

C2, 

Nt) 

9.235xl010 

2.904X101 

IT 

too 

X5, 

di, 

^2, 

d2 , 

fc2. 

,  <5, 

e2, 

c) 

9.241xl010 

2.906X101 

(xs, 

^1) 

di, 

^2, 

d2, 

k2, 

<5, 

£2) 

9.083xl010 

2.193X101 

(*?, 

x°s, 

^1> 

di, 

ki. 

d2. 

k2, 

8-i 

£2) 

4.335x  1015 

1.050X101 

(da, 

k2, 

S,  m2, 

Nt, 

A  E 

,  bE 

,  I<b, 

d_E, 

Kd,  5e) 

7.247x  1017 

2.340xl05 

In  [1,  4],  the  authors  estimate  the  parameter  vector 

6  =  (x^x^x9,  Ai,di,ei,  ki,e2,NT,c,  bE)J  G  M11.  (29) 

The  selection  score  for  this  parameter  vector  is  a(6o)  =  4.611  x  103.  In  Table  4  we  display, 
for  the  five  selections  of  sub- vector  6  €  M11  of  q  with  the  minimal  selection  scores,  the 
condition  numbers  of  the  corresponding  Fisher  information  matrices  and  the  selection  scores. 
In  addition  we  also  show  the  selection  0  £  M11  with  the  maximal  selection  score.  As  we  can 
see,  the  selection  score  values  range  from  2.881  x  101  to  2.340  x  105  for  the  (^3)  =  1352  078 
different  parameter  vectors  in  M11  which  can  be  selected  from  the  23  parameters  in  (28). 

As  we  also  can  see  from  Table  4  that  the  selection  algorithm  chooses  most  of  the  parame¬ 
ters  in  the  vector  (29).  For  instance,  the  sub- vector  (x9,  Ai,  d\,  £1,62)  of  (29)  appears  in  every 
one  of  the  top  five  parameter  vectors  displayed  in  Table  4.  In  fact  the  top  five  parameter 
vectors  have  the  sub-vector  (x9,  Ai,  d\,  e±,  A2,  di,  &2,  <5,  62)  €  M9  in  common  and  differ  only  by 
one  or  two  of  the  parameters  x9,  x9,  £q,  c,  and  Nt-  Use  of  the  subset  selection  algorithm 
discussed  here  (had  it  been  available)  might  have  proved  valuable  in  the  efforts  reported  in 

[1.  4]. 

In  Figure  2(a)  we  depict  the  minimal  selection  score  as  a  function  of  the  number  of 
parameters.  Table  5  contains  the  values  of  the  corresponding  selection  scores.  Figure  2(b)  is 
a  semilog  plot  of  Figure  2(a),  i.e.,  it  displays  the  logarithm  of  the  selection  score  as  a  function 
of  the  number  of  parameters.  Figure  2,  (b),  suggests  that  parameter  vectors  with  more 
than  thirteen  parameters  might  be  expected  to  have  large  uncertainty  when  estimated  from 
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(a) 


(b) 


Figure  2:  (a)  Minimal  selection  scores  (crosses)  and  exponential  approximations  (30)  (grey  solid  line)  respectively 
(31)  (grey  dashed  line)  versus  the  number  of  parameters  p.  (b)  Logarithm  of  minimal  selection  scores  (crosses) 
and  regression  lines  corresponding  to  (30)  (gray  solid  line)  respectively  to  (31)  (grey  dashed  line)  versus  number  of 
parameters  p. 
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Table  5:  Minimal  selection  scores  amin(p)  for  sub-vectors  in  Rp  of  (28),  p  — 1,2,...,  22. 


p 

1 

2 

3 

4 

5 

6 

7 

8 

Q^min  (jp) 

0.0340 

0.0523 

0.1203 

0.1679 

9.3796 

0.6511 

1.1375 

2.4166 

P 

9 

10 

11 

12 

13 

14 

15 

16 

^min  ip) 

10.503 

17.482 

28.81 

92.91 

243.34 

336.77 

566.86 

2  274.3 

P 

17 

18 

19 

20 

21 

22 

C^min  ip) 

3  372.4 

5  047.9 

9  664.4 

16  585 

51  631 

95  128 

observations,  because  the  minimal  selection  score  is  already  larger  than  100.  Figure  2(b)  also 
depicts  the  regression  line,  which  fits  the  logarithm  of  the  selection  score.  From  this  linear 
regression  we  conclude  the  selection  score  amjn(p)  grows  exponentially  with  the  number  p  of 
parameters  to  be  estimated.  More  precisely,  we  find 

amin{p)  ~  0.01133ea728p,  p  =  1, ...  ,22.  (30) 

Computing  the  minimal  selection  scores  for  p  =  1, . . . ,  22  requires  one  to  consider  all  possible 
choices  for  sub-vectors  of  (28)  in  Mp,  p  =  1,...,22,  i.e.,  to  consider  Ci )  =  8  388  605 

cases.  If  we  determine  the  regression  line  only  using  the  minimal  selection  scores  for  p  = 
1,2, 3, 20,  21,  22  we  obtain 

amin(p)  ~  0.01441e°'710p,  p  =  1, . . . ,  22.  (31) 


Table  6:  Time  for  computing  9  G  Rp  with  the  minimal  selection  score  on  a  laptop  computer,  once  the  23  x  23 
Fisher  information  matrix  for  the  HIV-model  has  been  computed. 


P 

1 

2 

3 

4 

5 

6 

7 

8 

time  (sec) 

0.054 

0.015 

0.075 

0.472 

1.66 

5.52 

14.05 

29.81 

P 

9 

10 

11 

12 

13 

14 

15 

16 

time  (sec) 

53.49 

80.02 

100.4 

105.8 

96.74 

74.01 

47.15 

24.09 

P 

17 

18 

19 

20 

21 

22 

time  (sec) 

10.69 

3.64 

1.02 

0.211 

0.034 

0.0041 

Computing  the  minimal  selection  scores  needed  for  (31)  requires  to  consider  only  (2|3)  + 
(?)  +  (3)  +  ©  +  ©  +  ©  =  2((?)  +  (?)  +  (3))  =4  094  cases.  In  Figures  2(a),  we  show 
the  curves  given  by  (30)  and  (31),  whereas  in  Figures  2(b),  also  the  corresponding  regression 
lines  are  depicted.  Table  6  shows  the  time  it  takes  on  a  laptop  computer  with  an  Intel® 
Core™ 2  Duo  processor  using  a  MATLAB  programm  to  compute  amin(p),  P  =  1 ,  -  -  - ,  22, 
once  the  23  x  23  Fisher  information  matrix  for  the  nominal  parameter  vector  go  has  been 
computed. 

Figure  3  is  the  same  as  Figure  2(b),  but  for  p  =  1, . . . ,  11.  The  curves  (30)  respectively 
(31)  can  be  used  to  determine  p  such  that  the  selection  score  amjn  (p)  is  smaller  than  a 
given  upper  bound.  From  Figure  3  we  can  see  that  in  order  to  have  ami n(p)  <  5  we  should 
choose  p  <  8,  which  is  correct  according  to  each  of  the  two  curves  (see  Table  5).  In  order  to 
have  amin  <  10  the  curves  suggest  p  <  9,  which  is  not  quite  correct,  because  according  to 


17 


Figure  3:  Minimal  selection  scores  (crosses)  and  and  exponential  approximation  (30)  (grey  solid  line)  for  p  = 
1,  -  -  - ,  11- 


Table  7:  The  top  5  parameter  vectors  0  £  R11  selected  according  the  the  criterion  of  minimal  condition  number 
for  the  Fisher  information  matrix. 


Parameter  vector  9 

k(T(0o)) 

a(8o) 

{x°,  x°,  Ai,  ei,  /,  mi,  m2,  NT,  A E,  dE,  SE ) 

1.735x10® 

1.908xl03 

(x?,  x°,  Ai,  ei,  /,  mi,  m2,  NT,  A E,  bE,  SE) 

1.738x10® 

1.897xl03 

{x°,  x°,  Ai,  ei,  /,  mi,  m2,  c,  \E,  dE,  SE) 

1.744x10® 

1.908  x10s 

{x°,  x2,  Ai,  ei,  /,  mi,  m2,  c,  \E,  bE,  SE) 

1.747x10® 

1.896x10s 

(x°,  x°,  Ai,  ei,  f,  mi,  m2,  c,  \E,  dE,  SE) 

1.788x10® 

1.910x10s 

Table  5  we  have  amin(9)  =  10.5,  so  that  we  should  choose  p  <  8.  In  Figure  4  we  graph  (in 
logarithmic  scales)  the  condition  number  k(J-p(0q))  of  the  corresponding  Fisher  information 
matrix  versus  the  smallest  selection  score  amin(p)  for  p  =  1, . . . ,  22.  It  is  clear  from  Figure  4 
that  the  condition  numbers  for  the  Fisher  information  matrix  corresponding  to  the  selected 
parameter  vector  9  E  does  not  show  a  monotone  behavior  with  respect  to  p.  We  could 
also  determine  the  selection  of  parameters  according  to  the  criterion  of  minimal  condition 
number  for  the  corresponding  Fisher  information  matrix.  In  Table  7  we  present  the  best 
5  selections  0  £  l11  according  to  this  criterion  together  with  the  condition  numbers  of  the 
Fisher  information  matrix  and  the  corresponding  selection  scores. 

In  Table  8  we  examine  the  effect  that  removing  parameters  from  an  estimation  has  in 
uncertainty  quantification.  The  coefficient  of  variation  (CV)  is  shown  for  each  parameter 
(see  (16)).  Five  cases  are  considered: 

(i)  The  parameter  vector 

9{lb)  =  (,X2,  Xg,  Ai,  d\,  ei,ki,  A2,  d2,  f,  k2,  mi,  e2,  c,  \E,  dE ,  Kd,  5E)T , 
which  is  the  sub-vector  in  M18  with  the  minimal  selection  score. 

(ii)  The  parameter  vector 

0(5,1)  =  Ai,di,  Aq,e2)T, 
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Figure  4:  Condition  number  k(J-p{0q))  versus  minimal  selection  score  amin(p)  for  the  HIV-model,  where  6  £  Rp, 
is  the  sub-vector  of  (28)  with  the  minimal  selection  score,  p  =  1, . . . ,  22.  Both  axes  are  in  logarithmic  scale. 

which  is  the  sub-vector  of  in  R5  with  the  minimal  selection  score. 

(iii)  The  parameter  vector 

e(  5’2)  =  {x°l,\1,kl,6,e2)T, 

which  is  the  sub-vector  in  R5  of  q  as  given  by  (28)  with  the  minimal  selection  score. 

(iv)  The  parameter  vector 

0(5,3)  =  (ei,  A2,mi,e2,  As)T, 

which  is  the  sub-vector  of  #(18)  in  R5  with  the  maximal  selection  score. 

(v)  The  parameter  vector 

=  (mi)  bE,  Kb,  d,E,  Kd)J , 

which  is  the  sub-vector  in  R5  of  q  as  given  by  (28)  with  the  maximal  selection  score. 

We  see  that  here  are  substantial  improvements  in  uncertainty  quantification  when  considering 
0(5,1)  insteaci  Qf  0( 18).  However,  just  taking  a  considerably  lower  dimensional  sub- vector  of 
0(18)  in  R5  does  not  lead  necessarily  to  a  drastic  improvement  of  the  estimate. 
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*1 

— 

— 

3.80 

— 

*5 

1.58X101 

3.43xl0_1 

Ai 

8.19X10-1 

3.56X10"1 

di 

g.soxio^1 

3.94x  10_1 

€l 

1.26x  102 

— 

ki 

7.67X102 

8.17xl0“2 

A2 

4.74X101 

C?2 

4.62X101 

— 

/ 

2.51x  102 

— 

fc2 

7.53X102 

— 

5 

— 

— 

mi 

2.29x10s 

— 

€2 

1.63xl02 

l.llxlO-1 

c 

7.74xl02 

— 

A  E 

2.18x10s 

— 

Be 

— 

— 

Kb 

2.55x10s 

— 

dE 

4.56xl02 

— 

Kd 

1.98x10s 

— 

8e 

1.72x10s 

— 

0(5,2) 

0(5,3) 

0(5, 

4.09xl0~2 

— 

— 

1.13X10"1 

— 

— 

— 

— 

— 

— 

8.49 

— 

9.57X10”2 

— 

— 

9.99 

— 

— 

— 

— 

— 

— 

3.29X10”1 

— 

— 

— 

9.85X102 

2.10x 

9.33X10'2 

6.34 

— 

— 

9.83X102 

— 

— 

— 

1.22x 

4.62  x 

— 

— 

1.16x 

— 

— 

4.40x 

5.2  The  CVS-model 


For  this  model  we  take  as  the  nominal  parameters  the  following  estimates  obtained  in  [45] 
using  data  obtained  at  bicycle  ergometer  tests  (for  a  description  and  units  see  Table  2): 


cl 

=  0.02305, 

Cr  = 

^vs 

=  0.6500, 

Cap  — 

Re 

=  0.2671, 

rt  = 

Q!f 

=  28.6785, 

Pe  = 

le 

=  -1.6744, 

7r  = 

0.04413,  cas  =  0.01016, 

0.03608,  cvp  =  0.1408, 

0.04150,  at  =  30.5587, 

25.0652,  ft  =  1.4132, 

-1.8607,  K  =  16.0376, 


k  =  0.05164,  M0  =  0.35, 

Ca,  o2  =  0.2,  qas  =  163.047, 

Rrpest  =  1.5446,  Ar™*k  =  177.682, 
^-pesk  =  254.325. 


p  =  0.011, 

Hot  =  5.0582, 
Rexer  =  0.3165, 


For  the  estimates  given  above  measurements  for  Pas,  H  and  Qt  were  available.  The  sampling 
times  tj  for  the  measurements  of  Pa s  and  H  were  uniformly  distributed  on  the  time  interval 
from  0  to  10  minutes  with  tj+ 1  —  tj  =  2  seconds,  i.e. ,  601  measurements  for  Pas  and  H . 
The  measurements  for  Qt  were  obtained  by  Doppler  echo-cardiography  and  consequently 
were  much  less  frequent  (only  20  measurements)  and  also  irregularly  distributed.  In  the 
following  we  shall  consider  Pas  as  the  only  measured  output  of  the  system,  i.e.,  we  have 
f(t;6)  =  Pas(t;8).  The  variance  of  the  measurement  errors  was  roughly  estimated  to  be 
al  =  10. 


Figure  5:  The  Pas-component  of  the  solution  of  system  (20)  with  the  nominal  parameter  values  and  the  Pas- 
measurements. 

The  equilibria  xrest  and  xexer  are  given  in  Table  9  (for  units  see  Table  2). 

As  for  the  HIV-model  also  in  case  of  the  CVS-model  the  Fisher  information  matrix  Jr25((/o) 
at  the  nominal  values  of  the  parameters  (27)  is  non-singular,  but  highly  ill-conditioned: 

cond^r25(Q,o)  =  2.5755  x  1031. 
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Table  9:  The  equilibria  xrest  and  a;exer  for  the  CVS-model. 


variable 

Pas 

Pva 

Pap 

Pv  p 

St 

&£ 

Sr 

<7r 

Rs 

H 

xreBt 

105.595 

4.277 

12.474 

5.367 

64.675 

0 

3.886 

0 

22.020 

78.85 

xexer 

122.115 

3.595 

10.441 

7.844 

88.092 

0 

5.293 

0 

14.445 

107.4 

Therefore  we  can  also  delete  the  regularity  test  for  the  Fisher  information  matrix  in  the 
selection  algorithm  in  case  of  the  CVS-model.  In  Figure  7  we  show  the  minimal  selection 
scores  for  p  =  3, . . .  ,20,  whereas  in  Table  10  we  list  the  minimal  selection  scores  amin(p).> 
p  =  1, . . . ,  24.  Table  10  clearly  shows  the  effect  of  the  Fisher  information  matrix  J-25  extreme 

Table  10:  Minimal  selection  scores  amin(p),  p  =  1, . . . ,  24,  for  the  CVS-model. 


V 

1 

2 

3 

4 

5 

6 

Q^min  (p) 

9.397X10'4 

2.042xl0'2 

5.796X10"2 

1.096X10”1 

2.315x  10”1 

3.299X10”1 

P 

7 

8 

9 

10 

11 

12 

C^min  (p) 

4.340X10'1 

6.251X10'1 

8.018x  10_1 

1.084 

1.723 

3.238 

P 

13 

14 

15 

16 

17 

18 

Q^rnin  (p) 

6.336 

17.13 

37.31 

59.37 

2.224x  102 

4.301  xlO2 

P 

19 

20 

21 

22 

23 

24 

C^min  (p) 

1.015xl03 

1.265  x10s 

3.038x10s 

2.396x10s1 

1.834x10s4 

2.459  xlO108 

ill-conditioning.  We  see  an  extreme  increase  of  the  selection  score  from  3  •  105  at  p  =  21  to 
2.4  •  1031  at  p=22.  This  is  also  reflected  in  Figure  6  which  clearly  shows  that  there  is  no 
reasonable  approximation  of  log(am;n(p)),  p  =  1, . . . ,  24,  by  a  regression  line.  However,  a 
regression  line  makes  sense  for  p  =  1, ...  ,21  as  can  be  see  from  Figure  6.  In  Figure  7  we 
depict  the  minimal  selection  scores  o:min (p) .  p  =  2, . . . ,  19,  together  with  the  approximating 
exponential  functions 


ccmin (p)  «  0.00670e°"58°p,  p  =  1, . . . ,  19.  (32) 

obtained  from  the  regression  line  for  p  =  1, . . . ,  19  (solid  grey  line)  and 

amin(p)  ~  0.00799ea61Op,  p=l,...,19.  (33) 

obtained  for  p  =  2, 3, 4, 17, 18, 19  (dashed  grey  line). 

In  Table  12  we  present  the  sub-vector  6  £  M10  of  q  given  by  (27)  with  the  minimal 
selection  score  together  with  the  coefficients  of  variation.  In  Figure  8  we  depict  the  classical 
sensitivities  for  the  chosen  sub-vector  6  £  M10  (black  solid  lines)  and  the  sensitivities  for  the 
other  15  parameters  (grey  dashed  lines).  The  right  panel  is  a  blow  up  for  the  sensitivities  with 
values  between  —0.1  and  0.1.  We  see  that  the  algorithm  chooses  predominantly  parameters 
with  larger  sensitivities. 
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Figure  6:  Logarithm  of  the  minimal  selection  scores,  p  =  1, . . . ,  24,  for  the  CVS-model. 


Table  11:  Time  for  computing  8  £  Rp  with  the  minimal  selection  score  on  a  laptop  computer,  once  the  25  x  25 
Fisher  information  matrix  for  the  CVS-model  has  been  computed. 


p 

1 

2 

3 

4 

5 

6 

7 

8 

9 

time  (sec) 

0.0038 

0.018 

0.104 

0.564 

2.364 

7.848 

21.24 

46.60 

88.31 

P 

10 

11 

12 

13 

14 

15 

16 

17 

18 

time  (sec) 

139.7 

188.3 

211.1 

208.9 

172.9 

123.1 

71.41 

36.10 

14.92 

P 

19 

20 

21 

22 

23 

24 

time  (sec) 

5.137 

1.382 

0.329 

0.093 

0.0645 

0.0005 

Table  12:  Coefficients  of  variance  for  the  sub-vector  8  £  R1()  with  the  minimal  selection  score. 


8  £  R10 

Cl 

OLl  «r 

7 e 

7r  P 

Qas 

Vtot 

r>rest 

itp 

a  rest 
-^pesk 

CV 

0.465 

0.087  0.172 

0.353 

0.622  0.114 

0.454 

0.290 

0.248 

0.213 

C^min(lO) 

1.084 
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Figure  7:  Minimal  selection  scores  amin(p)  (upper  panel)  and  log10 (amin(p))  (lower  panel),  p  =  2, 
CVS-model. 
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Figure  8:  Classical  sensitivities  for  the  nominal  values  of  6  £  R10  with  the  minimal  selection  score  (black  solid  lines) 
and  classical  sensitivities  of  the  remaining  15  parameters  (dashed  grey  lines). 


6  Concluding  Remarks 


As  we  have  noted,  inverse  problems  for  complex  system  models  containing  a  large  number 
of  parameters  are  difficult.  There  is  great  need  for  quantitative  methods  to  assist  in  posing 
inverse  problems  that  will  be  well  formulated  in  the  sense  of  the  ability  to  provide  parameter 
estimates  with  quantifiable  small  uncertainty  estimates.  We  have  introduced  and  illustrated 
use  of  such  an  algorithm  that  requires  prior  local  information  about  ranges  of  admissible 
parameter  values  and  initial  values  of  interest  along  with  information  on  the  error  in  the 
observation  process  to  be  used  with  the  inverse  problem.  These  are  needed  in  order  to 
implement  the  sensitivity/Fisher  matrix  based  algorithm. 

Because  sensitivity  of  a  model  with  respect  to  a  parameter  is  fundamentally  related  to 
the  ability  to  estimate  the  parameter,  and  because  sensitivity  is  a  local  concept,  we  observe 
that  the  pursuit  of  a  global  algorithm  to  use  in  formulating  parameter  estimation  or  inverse 
problems  is  most  likely  a  quest  that  will  go  unfulfilled. 
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